A GO-driven semantic similarity measure for quantifying the biological relatedness of gene products
Abstract
Advances in biological experiments, such as DNA microarrays, have produced large multidimensional data sets for examination and retrospective analysis. Scientists, however, rely heavily on existing biomedical knowledge to fully analyze and comprehend such data sets. Our proposed framework relies on the Gene Ontology to integrate a priori biomedical knowledge into traditional data analysis approaches. We explore the impact of considering each aspect of the Gene Ontology individually when quantifying the biological relatedness between gene products. We discuss two figure-of-merit scores for quantifying the pair-wise biological relatedness between gene products and the intra-cluster biological coherency of groups of gene products. Finally, we perform cluster deterioration simulation experiments on a well-scrutinized Saccharomyces cerevisiae data set consisting of hybridization measurements. The results illustrate a strong correlation between the devised cluster coherency figure of merit and the randomization of cluster membership.
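The idea of a cluster deterioration experiment can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring formulas, so everything below is an assumption made for illustration: pair-wise relatedness is stood in for by the Jaccard overlap of GO term sets, cluster coherency by the mean pair-wise relatedness of the members, and the gene names, term IDs, and function names (`pairwise_relatedness`, `cluster_coherency`, `deteriorate`) are hypothetical.

```python
import random
from itertools import combinations

# Hypothetical toy annotations: gene product -> set of GO term IDs.
# The IDs are placeholders for illustration, not real GO annotations.
ANNOTATIONS = {
    "g1": {"GO:A", "GO:B", "GO:C"},
    "g2": {"GO:A", "GO:B"},
    "g3": {"GO:A", "GO:C"},
    "g4": {"GO:A", "GO:B", "GO:C"},
    "g5": {"GO:X", "GO:Y"},
    "g6": {"GO:Y", "GO:Z"},
    "g7": {"GO:X", "GO:Z"},
    "g8": {"GO:W"},
}


def pairwise_relatedness(a, b, annotations=ANNOTATIONS):
    """Jaccard overlap of two GO term sets (an assumed stand-in measure)."""
    terms_a, terms_b = annotations[a], annotations[b]
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def cluster_coherency(cluster, annotations=ANNOTATIONS):
    """Mean pair-wise relatedness over all member pairs of one cluster."""
    pairs = list(combinations(sorted(cluster), 2))
    if not pairs:
        return 0.0
    total = sum(pairwise_relatedness(a, b, annotations) for a, b in pairs)
    return total / len(pairs)


def deteriorate(cluster, pool, n_swaps, rng):
    """Replace n_swaps members with randomly chosen genes from outside the cluster."""
    members = sorted(cluster)
    outsiders = sorted(g for g in pool if g not in cluster)
    removed = rng.sample(members, n_swaps)
    added = rng.sample(outsiders, n_swaps)
    return (set(members) - set(removed)) | set(added)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    cluster = {"g1", "g2", "g3", "g4"}
    for n_swaps in range(len(cluster) + 1):
        noisy = deteriorate(cluster, ANNOTATIONS, n_swaps, rng)
        print(n_swaps, sorted(noisy), round(cluster_coherency(noisy), 3))
```

In this toy setting, the score of the intact cluster can be compared with the score after progressively more members have been swapped for outsiders. A real experiment would use curated GO annotations and a similarity measure that takes the ontology structure into account, rather than plain set overlap.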
Similar resources
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
On Semantic Similarity and Relatedness for Knowledge-Driven Discovery in Biomedical Data
A great variety of tasks, from word sense disambiguation and document retrieval to assessing the functional similarity of gene products and validating protein-protein interaction networks, depend on the ability to measure the semantic similarity between concepts organized in ontologies. This report is a comprehensive study of classic and recent computational methods measuring semantic relatedne...
Curating Extracted Information through the Correlation between Structure and Function
We propose to apply the correlation between structure and function of gene products to curate information automatically extracted from biological literature. This can be achieved by automatically validating extracted information that satisfies the correlation, since it has strong evidence of being correct. We applied a semantic similarity measure (SSM) to identify a correlation between the modu...
Improving Information Extraction through Biological Correlation
We present a new method for improving the efficiency of information extraction systems applied to biological literature, using the correlation between structural and functional classifications of gene products. The method evaluates extracted information by checking if gene products from a common family match a common set of biological properties. To evaluate the method, we implemented it in a c...
Journal title:
- Intelligent Decision Technologies
Volume 3
Publication date: 2009